Inter-annotator Agreement for Dependency Annotation of Learner Language
Abstract
This paper reports on a study of inter-annotator agreement (IAA) for a dependency annotation scheme designed for learner English. Reliably annotated learner corpora are a necessary step for the development of POS tagging and parsing of learner language. In our study, three annotators marked several layers of annotation over learner texts of different levels, and they were able to obtain generally high agreement, especially after discussing the disagreements among themselves, without researcher intervention, illustrating the feasibility of the scheme. We pinpoint some of the problems in obtaining full agreement, including annotation scheme vagueness for certain learner innovations, interface design issues, and difficult syntactic constructions. In the process, we also develop ways to calculate agreement for sets of dependencies.
Similar resources
CzeSL – an error tagged corpus of Czech as a second language
Using an error-annotated learner corpus as the basis, the goal of this paper is two-fold: (i) to evaluate the practicality of the annotation scheme by computing inter-annotator agreement on a non-trivial sample of data, and (ii) to find out whether the application of automated linguistic annotation tools (taggers, spell checkers and grammar checkers) on the learner text is viable as a substitut...
Annotators' Certainty and Disagreements in Coreference and Bridging Annotation in Prague Dependency Treebank
In this paper, we present the results of the parallel Czech coreference and bridging annotation in the Prague Dependency Treebank 2.0. The annotation is carried out on dependency trees (on the tectogrammatical layer). We describe the inter-annotator agreement measurement, classify and analyse the most common types of annotators’ disagreement. On two selected long texts, we asked the annotators ...
Towards Universal Dependencies for Learner Chinese
We propose an annotation scheme for learner Chinese in the Universal Dependencies (UD) framework. The scheme was adapted from a UD scheme for Mandarin Chinese to take interlanguage characteristics into account. We applied the scheme to a set of 100 sentences written by learners of Chinese as a foreign language, and we report inter-annotator agreement on syntactic annotation.
Building a learner corpus
The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked levels to cope with a wide range of error types present in the input. Each level corrects different types of errors; links between the levels allow capturing errors in word order and complex discontinuous expressions. Errors are not only corrected, bu...
Phrase Structure Annotation and Parsing for Learner English
There has been almost no work on phrase structure annotation and parsing specially designed for learner English despite the fact that they are useful for representing the structural characteristics of learner English. To address this problem, in this paper, we first propose a phrase structure annotation scheme for learner English and annotate two different learner corpora using it. Second, we s...
Publication date: 2013